Conditional Probability as a Measure of Volatility Clustering in Financial Time Series

In the past few decades considerable effort has been expended in characterizing and modeling
financial time series. A number of stylized facts have been identified, and volatility clustering or 
the tendency toward persistence has emerged as the central feature. In this paper we propose an 
appropriately defined conditional probability as a new measure of volatility clustering. We test this 
measure by applying it to different stock market data, and we uncover a rich temporal structure 
in volatility fluctuations described very well by a scaling relation. The scale factor used in the 
scaling provides a direct measure of volatility clustering; such a measure may be used for developing 
techniques for option pricing, risk management, and economic forecasting. In addition, we present 
a stochastic volatility model that can display many of the salient features exhibited by volatilities 
of empirical financial time series, including the behavior of conditional probabilities that we have 
deduced.

I. INTRODUCTION

Forecasting from time series data necessarily involves an attempt to understand uncertainty; volatility, or the standard deviation, is a key measure of this uncertainty and is found to be time-varying in most financial time series. The seminal work of Engle [1], which first treated volatility as a process rather than just a number to estimate, led to tremendous efforts in devising dynamical volatility models in the last two decades. These models are of great importance in a variety of financial transactions, including option pricing and portfolio and risk management.
Excess volatility (well beyond what can be described by a simple Gaussian process) and the associated phenomenon of clustering are believed to be the key factors underlying many empirical statistical properties of asset prices, characterized by a few key "stylized facts" described later. A good measure of volatility clustering (roughly speaking, large changes in asset prices tend to be followed by large changes, and small changes by small changes) is thus important for understanding financial time series and for constructing and validating a good volatility model. The most popular characterization of volatility clustering is the correlation function of the instantaneous volatilities evaluated at two different times, which shows persistence up to a time scale of more than a month. It has also been established that there is a link between asset price volatility clustering and persistence in trading activity (this link has been the subject of an extended empirical study). However, the underlying market mechanism for volatility clustering is not clear. The aim of our paper is not to elucidate the mechanism for volatility clustering, but to introduce a more direct measure of it. Specifically, we propose that the conditional probability distribution of asset returns $r$ over a period $T$ (given the return $r_p$ in the previous time period) can be fruitfully used to characterize clustering. This is a direct measure based on the return over a time lag instead of the instantaneous volatility, and we believe it is more relevant to volatility forecasting. We analyze stock market data using this measure and find that the conditional probability is well described by a scaling relation: $P(r|r_p) = \frac{1}{w(r_p)} f(r/w(r_p))$. This scaling relation characterizes both the fat tails and the volatility clustering exhibited by financial time series. The fat tails are described by a universal scaling function $f(z)$. The functional form of the scaling factor $w(r_p)$, on the other hand, contains the essential information about volatility clustering on the time scale under consideration. The scaling factors we obtain from the stock market data allow us to identify regimes of high and low volatility clustering. We also present a simple phenomenological model which captures some of the key empirical features.

The key "stylized facts" about asset returns include the following. The unconditional distribution of returns shows a scaling form (fat tail): the distribution of returns $r$ in a given time interval (defined as the change in the logarithm of the price, normalized by a time-averaged volatility) is found to follow a power law, $P(|r| > x) \sim x^{-\eta_r}$, with the exponent $\eta_r \approx 3$ for U.S. stock markets, well outside the Lévy stable range of 0 to 2. This functional form holds for a range of intervals from minutes to several days, while for longer times the distribution of the returns is consistent with a slow crossover to a normal distribution. Another key fact is the existence of volatility clustering in financial time series, which is by now well established; it can be seen, for example, in the absolute value of the return $|r|$, which shows positive serial correlation over long lags (the Taylor effect). This long memory in the autocorrelation of absolute returns, on a time scale of more than a month, stands in contrast to the short-time correlations of asset returns themselves. Fat tails have been the subject of intense theoretical investigation, from Mandelbrot's pioneering early work [2] using stable distributions to the agent-based models of Bak et al. and Lux (see Ref. [13] for a survey of research on agent-based models used in finance). The key problem is to elucidate the nature of the underlying stochastic process that gives rise to both volatility clustering and the power-law (fat) tails in the distribution of asset returns.
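As an illustration of the Taylor effect described above, the following minimal sketch computes log returns over non-overlapping intervals of length T and the sample autocorrelation of their absolute values. The function names are hypothetical, and dividing by the sample standard deviation is only a stand-in, assumed here, for the time-averaged volatility mentioned in the text.

```python
import math

def log_returns(prices, T=1):
    """Log returns over non-overlapping intervals of T steps."""
    return [math.log(prices[i + T] / prices[i])
            for i in range(0, len(prices) - T, T)]

def normalize(returns):
    """Divide by the sample standard deviation (assumed proxy for the
    time-averaged volatility; not necessarily the authors' choice)."""
    n = len(returns)
    mean = sum(returns) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in returns) / n)
    return [x / std for x in returns]

def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var

def abs_return_autocorr(prices, T, lag):
    """Autocorrelation of |r| for normalized returns over intervals of T."""
    r = normalize(log_returns(prices, T))
    return autocorr([abs(v) for v in r], lag)
```

For a series with volatility clustering, `abs_return_autocorr` would be expected to stay positive out to long lags, while `autocorr` applied to the signed returns decays quickly.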



II. CONDITIONAL PROBABILITY AND SCALING FORM

In an effort to seek a direct quantitative characterization of clustering we consider $P(r|r_p)$, the probability of the return $r$ in a time interval of duration $T$, conditional on the absolute value of the return $r_p$ in the previous interval of the same duration. (We emphasize that the probability is not conditioned on the value of the return at an instant.) By varying $T$, we can check volatility clustering on different time scales. There is a growing literature on conditional measures of distribution for analyzing financial time series (for a review, see Ref. [15] and references therein). For example, the conditional probability of return intervals has been used recently to study scaling and memory effects in stock and currency data.

We have analyzed both high-frequency data and daily closing data of stock indices and individual stock prices using the conditional probability as a probe. Here we present only the results of our analysis of high-frequency data of QQQ (a stock which tracks the NASDAQ 100 index) from 1999 to 2004 and daily closing data of the Dow Jones Industrial Average (DJIA) from 1900 to 2004. We emphasize that the properties of the financial time series we present are rather general: we have checked that the same properties are also exhibited in other stock indices and futures data (for example, the Hang Seng Index, the Russell 2000 Index, and German government bond futures) as well as individual stocks. We have checked, as was found in previous studies [9], that the probability distribution of the returns in the time intervals $T$ = 1, 2, 4, 8, 16, 32 days for the DJIA exhibits a fat power-law tail with an exponent close to $-4$; this appears to be true for most stock indices and individual stock data.

We calculate $P(r|r_p)$ by grouping the data into different bins according to the value of $r_p$. In Figure 1(a) we display $P(r, T | r_p, T)$ for $T$ = 5 minutes for different values of $r_p$. It is clear from the figure that there is a positive correlation between the width of $P(r|r_p)$ and $r_p$. What is more interesting is that, when $r$ is scaled by the width of the distribution (the standard deviation of the conditional return), $w(r_p)$, the different curves of conditional probability collapse onto a universal curve: $P(r|r_p) = w(r_p)^{-1} f(r/w(r_p))$. Evidence for this is displayed in Fig. 1(b). Note that on the time scales we have analyzed, the probability distribution is symmetric with respect to $r$. Consequently, in Fig. 1(b) we have displayed only the absolute value of the return. The data collapse is good for a wide range of $T$, and the curves display a power-law tail with a well-defined exponent of approximately $-4$.

FIG. 1: (a) Conditional probability of return for $T$ = 5 minutes in QQQ. Different curves correspond to 10 different absolute values of the return $r_p$ in the previous interval, which are groups of bins centered at values ranging from $8.4 \times 10^{-4}$ to 0.011. The larger the value of $r_p$, the larger the width of the distribution. (b) The conditional probability distribution of return of QQQ (shown in (a)), when scaled by a scale factor $w(r_p)$, collapses onto a universal curve. $r_p$ is the absolute value of the return in the previous interval. The tail of the probability distribution can be described by a power law with the exponent approximately equal to $-4$.
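The binning and rescaling procedure just described can be sketched as follows. This is a minimal illustration rather than the analysis code used for the figures: the function names are hypothetical, and the use of equal-population bins is an assumption, since the text does not specify how the bins in $r_p$ were chosen. The width $w$ is computed as the root-mean-square return within each bin.

```python
import math

def bin_pairs(returns, n_bins):
    """Pair each return r with the absolute previous return |r_p| and split
    the pairs, sorted by |r_p|, into n_bins equal-population bins.
    Pairs left over after the last full bin are dropped."""
    pairs = sorted((abs(rp), r) for rp, r in zip(returns[:-1], returns[1:]))
    size = len(pairs) // n_bins
    if size == 0:
        raise ValueError("not enough data for the requested number of bins")
    return [pairs[b * size:(b + 1) * size] for b in range(n_bins)]

def conditional_widths(returns, n_bins=10):
    """One (mean |r_p|, w) tuple per bin, with w = sqrt(<r^2>) in the bin."""
    result = []
    for chunk in bin_pairs(returns, n_bins):
        mean_abs_rp = sum(a for a, _ in chunk) / len(chunk)
        w = math.sqrt(sum(r * r for _, r in chunk) / len(chunk))
        result.append((mean_abs_rp, w))
    return result

def rescaled_abs_returns(returns, n_bins=10):
    """Per bin, the sorted values |r| / w. If the scaling form holds,
    every bin samples the same distribution f."""
    rescaled = []
    for chunk in bin_pairs(returns, n_bins):
        w = math.sqrt(sum(r * r for _, r in chunk) / len(chunk))
        rescaled.append(sorted(abs(r) / w for _, r in chunk))
    return rescaled
```

Plotting histograms of the lists returned by `rescaled_abs_returns` on top of one another is one way to inspect the data collapse.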

We next examine the dependence of the scale factor $w(r_p)$ on $r_p$. Fig. 2 shows a plot of the scale factor $w(r_p)$ vs. $r_p$ for different values of $T$. It can be seen from the figure that there is a crossover value $r_c(T)$: for $r_p < r_c(T)$, $w$ is almost constant, while for $r_p > r_c(T)$, $w$ increases with $r_p$. The degree of the dependence of $w$ on $r_p$ can be taken as an indication of the strength of volatility clustering. If there is no volatility clustering, $w$ will not depend on $r_p$. Note that there is strong clustering at small $T$. As $T$ increases, the strength of clustering gradually decreases, indicating a crossover to the non-clustering regime. As $T$ increases beyond the time scale of volatility clustering, the clustering disappears. This crossover cannot be seen in the QQQ data, as the time scales involved are small. Our analysis of DJIA data shows an indication of such a crossover at a time scale of a few months. In this paper, we do not separate the cases of positive and negative returns in the previous time interval. Thus we do not show explicitly the well-known leverage effect, first expounded by Black. We have checked that the scaling and data collapse we obtained are equally valid when we separate out the cases of positive and negative returns in the previous interval. The leverage effect is reflected in the scaling factor $w(r_p)$, which shows $w(-r_p) > w(r_p)$ for $r_p > 0$ in the real data.

FIG. 2: The scale factor $w(r_p, T)$ vs. $r_p$ (the absolute value of the return in the previous interval) for different values of $T$, arising from analysis of QQQ data. The dependence is seen to be almost linear for sufficiently large $r_p$.
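One simple way to turn the dependence of $w$ on $r_p$ into a single number is a least-squares slope of $w$ against $r_p$ above the crossover. The sketch below is an assumption of this illustration, not a procedure stated in the text: the function name `clustering_slope` is hypothetical, the crossover `r_c` must be supplied by hand, and a linear fit is chosen only because the dependence is described as almost linear for sufficiently large $r_p$.

```python
def clustering_slope(points, r_c=0.0):
    """Least-squares slope of w against r_p using only points with r_p > r_c.

    points: iterable of (r_p, w) tuples, e.g. one per bin.
    A slope near zero indicates no dependence of w on r_p (no clustering).
    """
    selected = [(x, y) for x, y in points if x > r_c]
    if len(selected) < 2:
        raise ValueError("need at least two points above r_c")
    n = len(selected)
    mean_x = sum(x for x, _ in selected) / n
    mean_y = sum(y for _, y in selected) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in selected)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in selected)
    return sxy / sxx
```

Comparing the slope obtained for different interval lengths $T$ would give a rough picture of how the strength of clustering decays with time scale.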

Figure 3 shows that the same scaling form is also exhibited in DJIA data. We have checked that the data collapse extends also to data for different values of $T$, in addition to the different values of $r_p$ displayed here.

The data collapse we have displayed for different $r_p$ and different $T$, the power-law behavior including the value of the exponent, and the behavior of the scale factor, which encapsulates features of volatility clustering, are the same across data from the several other stock indices listed earlier and from individual stocks. This empirical universality can be stated as

$$P(r|r_p) = \frac{1}{w(r_p, T)}\, f\left(r/w(r_p, T)\right). \qquad (1)$$

Here $f(z)$ is a universal function describing the universal fat tail in the distribution. $f(z)$ satisfies $f(z) \to$ constant as $z \to 0$, $f(z) \sim 1/z^4$ for $z \gg 1$, and $\int_0^{\infty} f(z)\,dz = 1$. The dependence of $w(r_p, T)$ on $r_p$, on the other hand, describes the volatility clustering at the time scale $T$. If $w(r_p)$ is a constant (independent of $r_p$), then $P(r|r_p)$ does not depend on $r_p$, and there is no volatility correlation or clustering. The conditional probability distribution contains information about the conditional averages of the moments $\langle r^q \rangle_{r_p}$ of the distribution as well as various volatility correlation functions such as $\langle r^2 r_p^2 \rangle$. Given the scaling form we can evaluate these averages and correlation functions in terms of $w(r_p)$, which is itself given by $w(r_p) = \sqrt{\langle r^2 \rangle_{r_p}}$. In particular, the moments of the conditional probability distribution are given by $M_q(r_p) = \langle r^q \rangle_{r_p} = C_q w^q(r_p)$ (where $C_q$ is a universal constant), and $\langle r^2 r_p^2 \rangle = \int dr_p\, Q(r_p)\, w^2(r_p)\, r_p^2$, where $Q(r)$ is the unconditional probability distribution of the return. We believe that this scaling form provides a new and rather complete measure of volatility clustering.

FIG. 3: Conditional probability of return for $T$ = 2 and $T$ = 20 days in DJIA data. The data corresponding to $T$ = 20 have been shifted to the right for easy viewing. Different curves correspond to 8 different absolute values of the return $r_p$ in the previous interval. The inset shows the dependence of the width $w(r_p)$ on $r_p$. The tail of the probability distribution can be described by a power law with the exponent approximately equal to $-4$.
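The step from the scaling form to the moment relations can be made explicit with a change of variables $z = |r|/w(r_p)$. The short derivation below is a sketch: it writes the moments in terms of $|r|$, which assumes the symmetry in $r$ noted earlier, and it leaves the integration limits implicit so that they follow whichever convention is used to normalize $f$.

```latex
\begin{align}
M_q(r_p) &= \int dr\, |r|^q\, P(r|r_p)
          = \int dr\, |r|^q\, \frac{1}{w(r_p)}\, f\!\left(\frac{|r|}{w(r_p)}\right) \\
         &= w^q(r_p) \int dz\, z^q f(z) \;\equiv\; C_q\, w^q(r_p), \\[4pt]
\langle r^2 r_p^2 \rangle
         &= \int dr_p\, Q(r_p)\, r_p^2 \int dr\, r^2\, P(r|r_p)
          = \int dr_p\, Q(r_p)\, w^2(r_p)\, r_p^2 .
\end{align}
```

The last equality uses $C_2 = 1$, which follows from the definition $w(r_p) = \sqrt{\langle r^2 \rangle_{r_p}}$. Note also that with a tail $f(z) \sim 1/z^4$ the integral defining $C_q$ converges only for $q < 3$.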

III. MODEL AND DISCUSSION 

In the following we outline a model that captures the key features exhibited in the conditional probability distribution of stock market data. In a stochastic volatility model, the one-step asset return at time $t$ is written as $\Delta_t = S_t z_t$, where $z_t$ is a Gaussian random variable with zero mean and unit variance and $S_t$ is the magnitude of the price change. For the relatively short time scales we are interested in, we have set the intrinsic growth rate to zero. The distribution of $r$ depends on the dynamics of $S_t$. Slow changes in $S_t$ lead to volatility clustering.

There exist a few classes of volatility models that have been used to describe the dynamics of $S_t$. These include the widely used models based on GARCH-like processes and, more recently, the models based on a multifractal random walk (MRW), which will be discussed later. In our model, the dynamics of $S(t)$ is specified via the random variable $n(t)$, with $S(t) = S_0 \gamma^{n(t)}$. In order to describe both the behavior of probability distributions and temporal correlations we have devised the following model for the evolution of $n(t)$. The time evolution of the variable $n$ is assumed to be independent of the change in $S(t)$, and $n(t)$ executes a random walk with reflecting boundaries: we enforce the condition $n(t) \geq 0$; thus $S_0$ is the minimum value of $S(t)$. An upper bound on $n$, $n_{\max}$, can also be incorporated without affecting the scaling behavior of the model. We typically choose $\gamma^{n_{\max}} \approx 30$. The change in $n(t)$, $\delta n(t)$, is given by

$$\delta n(t) = \eta_t + \alpha \left\{ \sum_{i=1}^{N_c} [K(i+1) - K(i)]\, \eta_{t-i} + K(1)\, \eta_t - K(N_c+1)\, \eta_{t-N_c} \right\} - \beta \bar{\eta}. \qquad (2)$$

In the preceding, $\{\eta_i\}$ are independent random variables that assume the value $+1$ with probability $p$ and $-1$ with probability $1-p$. This asymmetry builds in the tendency to decrease the volatility. The mean value of $\eta_i$, $2p - 1 < 0$, is denoted by $\bar{\eta}$. We comment on the implications of the different terms next.

We focus on the limit $\alpha = 0$ and $\beta = 0$ first since it is amenable to analytic investigation; this limit is related to a model discussed previously in the literature. Note that this limit already builds in volatility clustering, as it takes many steps to change $n(t)$ significantly. It is easy to show that the steady-state probability distribution of $n$ is given by $P(n) = (1 - e^{-\lambda}) e^{-\lambda n} \approx \lambda e^{-\lambda n}$, where $\lambda = \ln((1-p)/p)$. The distribution of $S(t)$ is then given by a power law, $P(S) \sim S^{-\lambda/\ln\gamma - 1}$. This mechanism for generating a power-law distribution was first noted by Herbert Simon [16] in 1955. We have studied this limiting case of the model numerically and find that many features of the conditional probability distribution exhibited by the real data, including the power-law and scaling behaviors, are reproduced. We can show analytically that the conditional probability distribution exhibits scaling collapse, and that scale-invariant behavior with a power-law tail (with the exponent $-4$ if we choose $p = 1/(1+\gamma^2)$) exists for $r > \sigma_c$, where the crossover scale $\sigma_c$ depends on $\gamma$, $T$, and $p$. The numerical data in fact show a somewhat larger range of power-law behavior. The re-scaling factor required for data collapse is simply proportional to $r_p$ in our analysis, as we have observed in the real data and in numerical simulations of the model when $r_p$ is not too small. This simple limit captures important features of volatility clustering reflected in conditional probability distributions.
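The limiting case with both the memory and drift-control terms switched off can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' simulation code: the function names are hypothetical, and the boundary rule (a down-step attempted at $n = 0$ leaves $n$ unchanged) is one possible reading of "reflecting", chosen because it reproduces the geometric steady state $P(n) = (1 - e^{-\lambda}) e^{-\lambda n}$ quoted in the text.

```python
import math
import random

def simulate_n(p, steps, seed=0, burn_in=1000):
    """Biased +/-1 random walk for n(t) with a reflecting boundary at n = 0.

    Each step is +1 with probability p and -1 otherwise; a down-step at
    n = 0 leaves n at 0 (assumed boundary rule). The first burn_in steps
    are discarded."""
    rng = random.Random(seed)
    n = 0
    path = []
    for t in range(burn_in + steps):
        eta = 1 if rng.random() < p else -1
        n = max(n + eta, 0)
        if t >= burn_in:
            path.append(n)
    return path

def stationary_pn(p, n):
    """Steady-state probability P(n) = (1 - e^{-lam}) e^{-lam n}."""
    lam = math.log((1 - p) / p)
    return (1 - math.exp(-lam)) * math.exp(-lam * n)

def returns_from_path(path, gamma, s0=1.0, seed=1):
    """One-step returns S(t) * z_t with S(t) = s0 * gamma**n(t) and z_t
    a standard Gaussian variable."""
    rng = random.Random(seed)
    return [s0 * gamma ** n * rng.gauss(0.0, 1.0) for n in path]
```

For example, `simulate_n(0.3, 200000)` spends a fraction of time at $n = 0$ close to `stationary_pn(0.3, 0)`; the value $p = 0.3$ is chosen here only so that the walk equilibrates quickly, and is not a parameter from the text.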

The second term in Eq. (2) is based on the multifractal random walk model, which builds in long-time correlations via a logarithmic decay of the log-volatility correlation $\langle \log|r(t+\tau)| \log|r(t)| \rangle$. This term allows us to reproduce the more subtle temporal autocorrelation behavior observed in the data and follows the implementation in Ref. [21]. The long-term memory effects are incorporated by making the change in $n(t)$ depend on the steps $\eta_{t-i}$ at earlier times with a kernel given by $K(i) = 1/\sqrt{i}$ (this corresponds to the MRW part of the model given by $n(t) = \alpha \sum_{i=1}^{N_c} K(i)\, \eta_{t-i}$) and allowing memory up to $N_c$ time steps, chosen to be 1000 in our simulations. The final term allows us to control the rate of drift to lower values of $n$. We have simulated this model with $\alpha_0 = 0.1$ ($\alpha = \alpha_0/\ln\gamma$) and $\beta \approx 1.3$ for $\gamma = 1.05$ and displayed the results for $P(r|r_p)$ in Figure 4. The model with the stated parameters reproduces the fat tail in the unconditional probability distribution for $r$ observed in the data. The non-universal scale factor $w(r_p)$ is similar to those found from our empirical analysis. We have also checked that this model retains the same temporal behavior in the log-volatility correlation exhibited by the pure MRW model. Thus the model we have investigated is capable of reproducing both probability distributions (conditional and unconditional) and temporal autocorrelations. We note in passing that the model as it stands cannot be used to study the leverage effect; however, it can be modified to do so.

FIG. 4: The scaled conditional probability distributions of return for the mixed model given by Eq. (2) with $\gamma = 1.05$, $\alpha_0 = 0.1$, and $\beta = 1.3$. The time lag is $T = 10$. The curves, corresponding to different absolute values of the return $r_p$ in the previous interval, collapse onto a universal curve when scaled by a scale factor $w(r_p)$. The tail of the probability distribution is again described by a power law with the exponent equal to $-4$. The inset shows the dependence of $w(r_p)$ on $r_p$.
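The memory term of Eq. (2) can be read as the one-step increment of the MRW sum $\alpha \sum_{i=1}^{N_c} K(i)\, \eta_{t-i}$. The sketch below checks this numerically. It assumes that the sums run from $i = 1$ to $N_c$ and that the increment is taken between times $t$ and $t+1$; the function names are hypothetical, and the values of `alpha` and `n_c` in the usage note are arbitrary.

```python
import math
import random

def kernel(i):
    """MRW memory kernel K(i) = 1 / sqrt(i)."""
    return 1.0 / math.sqrt(i)

def n_mrw(eta, t, alpha, n_c):
    """alpha * sum_{i=1}^{n_c} K(i) * eta[t - i]; requires t >= n_c."""
    return alpha * sum(kernel(i) * eta[t - i] for i in range(1, n_c + 1))

def memory_increment(eta, t, alpha, n_c):
    """The term multiplied by alpha in Eq. (2), evaluated at time t
    (requires t >= n_c so that no index wraps around)."""
    s = sum((kernel(i + 1) - kernel(i)) * eta[t - i]
            for i in range(1, n_c + 1))
    return alpha * (s + kernel(1) * eta[t] - kernel(n_c + 1) * eta[t - n_c])

def random_steps(p, length, seed=0):
    """Independent steps: +1 with probability p, -1 otherwise."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else -1 for _ in range(length)]
```

With `eta = random_steps(0.48, 50)`, the value `memory_increment(eta, 20, 2.0, 10)` agrees with `n_mrw(eta, 21, 2.0, 10) - n_mrw(eta, 20, 2.0, 10)` up to rounding, because the sum in Eq. (2) telescopes.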

In summary, we have proposed a direct measure of volatility clustering in financial time series based on the conditional probability distribution of asset returns over a time period given the return over the previous time period. We discovered that the conditional probability of stock market data can be well described by a scaling relation, which reflects both the fat tails and the volatility clustering of the financial time series. In particular, the strength of volatility clustering is reflected in the functional form of the scaling factor $w(r_p)$. By extracting $w(r_p)$ from market data, we are able to estimate the future volatility over a time period, given the return in the previous period. This may be useful in modelling financial transactions, including option pricing and portfolio and risk management; all of these depend crucially on volatility estimation. The clustering of activities and fat tails in the associated distributions are very common in the dynamics of many social [24] and natural phenomena (e.g., earthquake clustering [25]). The conditional probability measure we have presented in this paper may serve as a useful tool for characterizing other clustering phenomena.

This work was supported by the National University